61 research outputs found

    Array Configuration-Agnostic Personal Voice Activity Detection Based on Spatial Coherence

    Personal voice activity detection (PVAD) has received increased attention due to the growing popularity of personal mobile devices and smart speakers. PVAD is often an integral element of speech enhancement and recognition for these applications, in which lightweight signal processing is enabled only for the target user. However, in real-world scenarios, the detection performance may degrade because of competing speakers, background noise, and reverberation. To address this problem, we propose to use equivalent rectangular bandwidth (ERB)-scaled spatial coherence as the input feature to train an array configuration-agnostic PVAD (ARCA-PVAD) network. Although the network model requires only 112k parameters, it exhibits excellent detection performance and robustness in adverse acoustic conditions. Notably, the proposed ARCA-PVAD system is scalable to array configurations. Experimental results demonstrate the superior performance of the proposed ARCA-PVAD system over a baseline in terms of the area under the receiver operating characteristic curve and the equal error rate. Comment: Accepted by INTER-NOISE 2023. arXiv admin note: text overlap with arXiv:2211.0874

    A Wearable Indoor Navigation System for Blind and Visually Impaired Individuals

    Indoor positioning and navigation for blind and visually impaired individuals have become an active field of research. The development of a reliable positioning and navigational system will reduce the suffering of people with visual disabilities, help them live more independently, and promote their employment opportunities. In this work, a coarse-to-fine multi-resolution model is proposed for indoor navigation in hallway environments based on the use of a wearable computer called the eButton. This self-constructed device contains multiple sensors which are used for indoor positioning and localization in three layers of resolution: a global positioning system (GPS) layer for building identification; a Wi-Fi and barometer layer for rough position localization; and a digital camera and motion sensor layer for precise localization. In this multi-resolution model, a new theoretical framework is developed which uses the change of atmospheric pressure to determine the floor number in a multistory building. The digital camera and motion sensors within the eButton acquire both pictorial and motion data as a person with normal vision walks along a hallway to establish a database. Precise indoor positioning and localization information is provided to the visually impaired individual based on a Kalman filter fusion algorithm and an automatic matching algorithm between the acquired images and those in the pre-established database. Motion calculation based on the data from the motion sensors is used to refine the localization result. Experiments were conducted to evaluate the performance of the algorithms. Our results show that the new device and algorithms can precisely determine the floor level and indoor location along hallways in multistory buildings, providing a powerful and unobtrusive navigational tool for blind and visually impaired individuals.
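
The abstract above mentions using the change of atmospheric pressure to determine the floor number but does not give the framework itself. The following is only a minimal sketch of the general idea, assuming the widely used international barometric formula, a reference pressure measured at the ground floor, and a uniform storey height. The function names, the 3.0 m storey height, and the floor-numbering convention are hypothetical and are not taken from the paper.

```python
def altitude_m(pressure_hpa, sea_level_hpa=1013.25):
    """Approximate altitude in metres from pressure (international barometric formula)."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


def estimate_floor(pressure_hpa, ground_pressure_hpa,
                   floor_height_m=3.0, ground_floor=1):
    """Estimate the floor number from the pressure drop relative to the ground floor.

    Assumptions (not from the paper): every storey is floor_height_m tall and
    ground_pressure_hpa was measured recently enough that weather drift is small.
    """
    # Height gained since the ground-floor reading, in metres.
    rise_m = altitude_m(pressure_hpa) - altitude_m(ground_pressure_hpa)
    # Convert the height gain into a whole number of storeys.
    return ground_floor + round(rise_m / floor_height_m)


if __name__ == "__main__":
    # A drop of roughly 1 hPa corresponds to a climb of several metres.
    print(estimate_floor(1012.17, 1013.25))
```

Only the pressure difference matters here, so the sea-level constant largely cancels out. In practice the ground-floor reference would need periodic refreshing because ambient pressure drifts with the weather.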

    Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

    Recent research advances in deep neural network (DNN)-based beamformers have shown great promise for speech enhancement under adverse acoustic conditions. Different network architectures and input features have been explored in estimating beamforming weights. In this paper, we propose a deep beamformer based on an efficient convolutional recurrent network (CRN) trained with a novel ARray RespOnse-aWare (ARROW) loss function. The ARROW loss exploits the array responses of the target and interferer by using the ground truth relative transfer functions (RTFs). The DNN-based beamforming system, trained with ARROW loss through supervised learning, is able to perform speech enhancement and speaker localization jointly. Experimental results have shown that the proposed deep beamformer, trained with the linearly weighted scale-invariant source-to-noise ratio (SI-SNR) and ARROW loss functions, achieves superior performance in speech enhancement and speaker localization compared to two baselines. Comment: 6 pages
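
The abstract above names SI-SNR as one of the training objectives without stating its formula. The sketch below uses a commonly cited definition (zero-mean signals, projection of the estimate onto the target, energy ratio in dB) and is an assumption about the metric, not the authors' implementation. The ARROW loss is not reproduced because the abstract does not define it. The function name `si_snr` and the `eps` stabilizer are illustrative choices.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)
    # Remove the mean so that a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [x - est_mean for x in estimate]
    tgt = [x - tgt_mean for x in target]
    # Project the estimate onto the target to get the scaled target component.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    projection = [scale * t for t in tgt]
    # Whatever is left after removing the projection is treated as noise.
    noise = [e - p for e, p in zip(est, projection)]
    proj_energy = sum(p * p for p in projection)
    noise_energy = sum(x * x for x in noise)
    return 10.0 * math.log10((proj_energy + eps) / (noise_energy + eps))


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    enhanced = [1.1, -0.9, 0.8, -1.2]
    print(si_snr(enhanced, clean))
    # Rescaling the estimate should leave the score essentially unchanged.
    print(si_snr([3.0 * x for x in enhanced], clean))
```

When used as a training loss, the negative of this value is typically minimized. A linear weighting with a second loss term, as the abstract describes, would be a weighted sum of the two.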

    The suppression of Finite Size Effect within a Few Lattices

    Boundary modes localized on the boundaries of a finite-size lattice experience a finite size effect (FSE) that could result in unwanted couplings, crosstalk, and the formation of gaps even in topological boundary modes. It is commonly believed that the FSE decays exponentially with the size of the system and thus requires many lattices before eventually becoming negligibly small. Here we identify a special type of FSE of some boundary modes that apparently vanishes at some particular wave vectors along the boundary. Moreover, the number of wave vectors where the FSE vanishes equals the number of lattices across the strip. We analytically demonstrate this type of FSE in a simple model and prove this peculiar feature. We also provide a physical system consisting of a plasmonic sphere array where this FSE is present. Our work points to the possibility of almost arbitrary tuning of the FSE, which facilitates unprecedented manipulation of the coupling strength between modes or channels, such as the integration of multiple waveguides and photonic non-Abelian braiding. Comment: 22 pages, 8 figures

    Binary Star Evolution in Different Environments: Filamentary, Fractal, Halo and Tidal-tail Clusters

    Using membership of 85 open clusters from previous studies (Pang et al. 2021a,b, 2022b; Li et al. 2021) based on Gaia DR3 data, we identify binary candidates in the color-magnitude diagram, for systems with mass ratio q > 0.4. The binary fraction is corrected for incompleteness at different distances due to the Gaia angular resolution limit. We find a decreasing binary fraction with increasing cluster age, with substantial scatter. For clusters with a total mass > 200 $M_\odot$, the binary fraction is independent of cluster mass. The binary fraction depends strongly on stellar density. Among four types of cluster environments, the lowest-density filamentary and fractal stellar groups have the highest mean binary fraction: 23.6% and 23.2%, respectively. The mean binary fraction in tidal-tail clusters is 20.8%, and is lowest in the densest halo-type clusters: 14.8%. We find clear evidence of early disruptions of binary stars in the cluster sample. The radial binary fraction depends strongly on the cluster-centric distance across all four types of environments, with the smallest binary fraction within the half-mass radius $r_h$, and increasing towards a few $r_h$. Only hints of mass segregation are found in the target clusters. The observed amount of mass segregation is not significant enough to generate a global effect inside the target clusters. We evaluate the bias of unresolved binary systems (assuming a primary mass of 1 $M_\odot$) in 1D tangential velocity, which is 0.1-1 $\rm km\,s^{-1}$. Further studies are required to characterize the internal star cluster kinematics using Gaia proper motions.

    Toward highly potent cancer agents by modulating the c-2 group of the arylthioindole class of tubulin polymerization inhibitors

    New arylthioindole derivatives having different cyclic substituents at position 2 of the indole were synthesized as anticancer agents. Several compounds inhibited tubulin polymerization at submicromolar concentrations and inhibited cell growth at low nanomolar concentrations. Compounds 18 and 57 were superior to the previously synthesized 5. Compound 18 was exceptionally potent as an inhibitor of cell growth: it showed IC50 = 1.0 nM in MCF-7 cells, and it was uniformly active in the whole panel of cancer cells and superior to colchicine and combretastatin A-4. Compounds 18, 20, 55, and 57 were notably more potent than vinorelbine, vinblastine, and paclitaxel in the NCI/ADR-RES and Messa/Dx5 cell lines, which overexpress P-glycoprotein. Compounds 18 and 57 showed initial vascular disrupting effects in a tumor model of liver rhabdomyosarcomas at 15 mg/kg intravenous dosage. Derivative 18 showed water solubility and higher metabolic stability than 5 in human liver microsomes.

    Measurement of the charge asymmetry in top-quark pair production in the lepton-plus-jets final state in pp collision data at $\sqrt{s}=8\,\mathrm{TeV}$ with the ATLAS detector

    Search for single production of vector-like quarks decaying into Wb in pp collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector

    Measurements of top-quark pair differential cross-sections in the $e\mu$ channel in $pp$ collisions at $\sqrt{s} = 13$ TeV using the ATLAS detector
